{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using SDBN Click Model To Overcome Position Bias\n",
    "\n",
    "This section we use the _Simplified Dynamic Bayesian Network_ (SDBN) to overcome the position bias that we saw with direct Click-Through-Rate. We consider the SDBN judgments and how they compare to just the click through rate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.append('../..')\n",
    "from ltr.sdbn_functions import all_sessions, get_sessions\n",
    "from aips import fetch_products, render_judged\n",
    "import pandas\n",
    "# if using a Jupyter notebook, includue:\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "sessions = all_sessions()\n",
    "products = fetch_products(doc_ids=sessions['doc_id'].unique())\n",
    "\n",
    "def print_dataframe(dataframe):\n",
    "    pandas.reset_option(\"all\")\n",
    "    merged = dataframe.merge(products[[\"upc\", \"name\"]], left_on='doc_id', right_on='upc', how='left')\n",
    "    print(merged.rename(columns={\"upc\": \"doc_id\"}).set_index(\"doc_id\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Listing 11.7\n",
    "\n",
    "Click models overcome position bias by learning an examine probability on each ranking. SDBN tracks examines relative to the the last click. This code marks last click position per session so we can compute examine probabilities."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "#%load -s calculate_examine_probability ../ltr/sdbn_functions.py\n",
    "def calculate_examine_probability(sessions):\n",
    "    last_click_per_session = sessions.groupby(\n",
    "        [\"clicked\", \"sess_id\"])[\"rank\"].max()[True]\n",
    "    sessions[\"last_click_rank\"] = last_click_per_session\n",
    "    sessions[\"examined\"] = \\\n",
    "      sessions[\"rank\"] <= sessions[\"last_click_rank\"]\n",
    "    return sessions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "         query  rank        doc_id  clicked  last_click_rank  examined\n",
      "sess_id                                                               \n",
      "3        dryer   0.0   12505451713    False              9.0      True\n",
      "3        dryer   1.0   84691226727    False              9.0      True\n",
      "3        dryer   2.0  883049066905    False              9.0      True\n",
      "3        dryer   3.0   48231011396    False              9.0      True\n",
      "3        dryer   4.0   74108056764    False              9.0      True\n",
      "3        dryer   5.0   77283045400    False              9.0      True\n",
      "3        dryer   6.0  783722274422    False              9.0      True\n",
      "3        dryer   7.0  665331101927    False              9.0      True\n",
      "3        dryer   8.0   14381196320     True              9.0      True\n",
      "3        dryer   9.0   74108096487     True              9.0      True\n",
      "3        dryer  10.0   74108007469    False              9.0     False\n",
      "3        dryer  11.0   12505525766    False              9.0     False\n",
      "3        dryer  12.0   48231011402    False              9.0     False\n",
      "3        dryer  13.0   84691226703    False              9.0     False\n",
      "3        dryer  14.0  883929085118    False              9.0     False\n",
      "3        dryer  15.0   36725561977    False              9.0     False\n",
      "3        dryer  16.0   36725578241    False              9.0     False\n",
      "3        dryer  17.0   12505527456    False              9.0     False\n",
      "3        dryer  18.0   36172950027    False              9.0     False\n",
      "3        dryer  19.0  856751002097    False              9.0     False\n"
     ]
    }
   ],
   "source": [
    "sessions = get_sessions(\"dryer\")\n",
    "probablity = calculate_examine_probability(sessions).loc[3]\n",
    "print(probablity)"
   ]
  },
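  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see the examine logic in isolation, here is a minimal sketch on made-up data. The `toy_sessions` DataFrame and its values are hypothetical and are not taken from the retrotech sessions; the code mirrors the idea behind `calculate_examine_probability` rather than replacing it. A result counts as examined when its rank is at or above the last click in its session."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas\n",
    "\n",
    "# Hypothetical toy data: two sessions, indexed by sess_id like the real sessions\n",
    "toy_sessions = pandas.DataFrame({\n",
    "    \"sess_id\": [1, 1, 1, 2, 2, 2],\n",
    "    \"rank\":    [0, 1, 2, 0, 1, 2],\n",
    "    \"doc_id\":  [\"a\", \"b\", \"c\", \"a\", \"b\", \"c\"],\n",
    "    \"clicked\": [False, True, False, False, False, True],\n",
    "}).set_index(\"sess_id\")\n",
    "\n",
    "# Rank of the last (lowest on the page) click in each session\n",
    "toy_last_click = toy_sessions[toy_sessions[\"clicked\"]] \\\n",
    "    .groupby(\"sess_id\")[\"rank\"].max()\n",
    "# Index alignment copies each session's value onto all of its rows\n",
    "toy_sessions[\"last_click_rank\"] = toy_last_click\n",
    "# Anything at or above the last click is assumed to have been examined\n",
    "toy_sessions[\"examined\"] = \\\n",
    "    toy_sessions[\"rank\"] <= toy_sessions[\"last_click_rank\"]\n",
    "print(toy_sessions)"
   ]
  },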
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Listing 11.8\n",
    "\n",
    "Aggregate clicks and examine counts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "#%load -s calculate_clicked_examined ../ltr/sdbn_functions.py\n",
    "def calculate_clicked_examined(sessions):\n",
    "    sessions = calculate_examine_probability(sessions)\n",
    "    return sessions[sessions[\"examined\"]] \\\n",
    "        .groupby(\"doc_id\")[[\"clicked\", \"examined\"]].sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "              clicked  examined                                 name\n",
      "doc_id                                                              \n",
      "12505451713       355      2707  Frigidaire - Semi-Rigid Dryer Ve...\n",
      "12505525766       268       974  Smart Choice - 6' 30 Amp 3-Prong...\n",
      "12505527456       110       428  Smart Choice - 1/2\" Safety+PLUS ...\n",
      "14381196320       217      1202             The Mind Snatchers - DVD\n",
      "36172950027        97       971  Tools in the Dryer: A Rarities C...\n",
      "36725561977       119       572  Samsung - 3.5 Cu. Ft. 6-Cycle Hi...\n",
      "36725578241       130       477  Samsung - 7.3 Cu. Ft. 7-Cycle El...\n",
      "48231011396       166       423  LG - 3.5 Cu. Ft. 7-Cycle High-Ef...\n",
      "48231011402       213       818  LG - 7.1 Cu. Ft. 7-Cycle Electri...\n",
      "74108007469       208       708  Conair - 1875-Watt Folding Handl...\n",
      "74108056764       273      1791  Conair - Infiniti Ionic Cord-Kee...\n",
      "74108096487       235      1097  Conair - Infiniti Cord-Keeper Pr...\n",
      "77283045400       276      1625      Hello Kitty - Hair Dryer - Pink\n",
      "84691226703       408      2015  Hotpoint - 6.0 Cu. Ft. 3-Cycle E...\n",
      "84691226727       804      2541  GE - 6.0 Cu. Ft. 3-Cycle Electri...\n",
      "665331101927      270      1347            Everything in Static - CD\n",
      "783722274422      288      1498  The Independent - Widescreen Sub...\n",
      "856751002097      133       323     Practecol - Dryer Balls (2-Pack)\n",
      "883049066905      286      2138   Whirlpool - Affresh Washer Cleaner\n",
      "883929085118       44       578  A Charlie Brown Christmas - AC3 ...\n"
     ]
    }
   ],
   "source": [
    "sessions = get_sessions(\"dryer\")\n",
    "clicked_examined_data = calculate_clicked_examined(sessions)\n",
    "print_dataframe(clicked_examined_data)"
   ]
  },
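  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a small illustration of the aggregation step, the sketch below sums hypothetical, already-flagged rows per document. The `toy_flagged` data is invented for illustration. Because `clicked` and `examined` are booleans, summing them per `doc_id` counts the clicks and the examines."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas\n",
    "\n",
    "# Hypothetical rows that already carry clicked/examined flags (made-up values)\n",
    "toy_flagged = pandas.DataFrame({\n",
    "    \"doc_id\":   [\"a\", \"b\", \"c\", \"a\", \"b\", \"c\"],\n",
    "    \"clicked\":  [False, True, False, False, False, True],\n",
    "    \"examined\": [True, True, False, True, True, True],\n",
    "})\n",
    "\n",
    "# Keep only examined rows, then sum the boolean flags per document\n",
    "toy_counts = toy_flagged[toy_flagged[\"examined\"]] \\\n",
    "    .groupby(\"doc_id\")[[\"clicked\", \"examined\"]].sum()\n",
    "print(toy_counts)"
   ]
  },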
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Listing 11.9\n",
    "\n",
    "We compute a grade - a probability of relevance - by dividing the clicks by examines. This is the kind of dynamic 'click thru rate' of SDBN, that accounts for whether the result was actually seen by users, not just whether it was shown on the screen."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "#%load -s calculate_grade ../ltr/sdbn_functions.py\n",
    "def calculate_grade(sessions):\n",
    "    sessions = calculate_clicked_examined(sessions)\n",
    "    sessions[\"grade\"] = sessions[\"clicked\"] / sessions[\"examined\"]\n",
    "    return sessions.sort_values(\"grade\", ascending=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "              clicked  examined     grade                                 name\n",
      "doc_id                                                                        \n",
      "856751002097      133       323  0.411765     Practecol - Dryer Balls (2-Pack)\n",
      "48231011396       166       423  0.392435  LG - 3.5 Cu. Ft. 7-Cycle High-Ef...\n",
      "84691226727       804      2541  0.316411  GE - 6.0 Cu. Ft. 3-Cycle Electri...\n",
      "74108007469       208       708  0.293785  Conair - 1875-Watt Folding Handl...\n",
      "12505525766       268       974  0.275154  Smart Choice - 6' 30 Amp 3-Prong...\n",
      "36725578241       130       477  0.272537  Samsung - 7.3 Cu. Ft. 7-Cycle El...\n",
      "48231011402       213       818  0.260391  LG - 7.1 Cu. Ft. 7-Cycle Electri...\n",
      "12505527456       110       428  0.257009  Smart Choice - 1/2\" Safety+PLUS ...\n",
      "74108096487       235      1097  0.214221  Conair - Infiniti Cord-Keeper Pr...\n",
      "36725561977       119       572  0.208042  Samsung - 3.5 Cu. Ft. 6-Cycle Hi...\n",
      "84691226703       408      2015  0.202481  Hotpoint - 6.0 Cu. Ft. 3-Cycle E...\n",
      "665331101927      270      1347  0.200445            Everything in Static - CD\n",
      "783722274422      288      1498  0.192256  The Independent - Widescreen Sub...\n",
      "14381196320       217      1202  0.180532             The Mind Snatchers - DVD\n",
      "77283045400       276      1625  0.169846      Hello Kitty - Hair Dryer - Pink\n",
      "74108056764       273      1791  0.152429  Conair - Infiniti Ionic Cord-Kee...\n",
      "883049066905      286      2138  0.133770   Whirlpool - Affresh Washer Cleaner\n",
      "12505451713       355      2707  0.131141  Frigidaire - Semi-Rigid Dryer Ve...\n",
      "36172950027        97       971  0.099897  Tools in the Dryer: A Rarities C...\n",
      "883929085118       44       578  0.076125  A Charlie Brown Christmas - AC3 ...\n"
     ]
    }
   ],
   "source": [
    "query = \"dryer\"\n",
    "sessions = get_sessions(query)\n",
    "grade_data = calculate_grade(sessions)\n",
    "print_dataframe(grade_data)"
   ]
  },
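  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The sketch below contrasts raw click-through rate with an SDBN-style grade on invented counts. The `toy_stats` table, its document names, and its numbers are assumptions chosen for illustration. A result shown low in the ranking is rarely examined, so dividing its clicks by impressions understates it, while dividing by examines does not."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas\n",
    "\n",
    "# Hypothetical counts: both results were shown 100 times, but the\n",
    "# low-ranked one was examined in only 20 of those sessions\n",
    "toy_stats = pandas.DataFrame(\n",
    "    {\"impressions\": [100, 100], \"examined\": [100, 20], \"clicked\": [20, 10]},\n",
    "    index=pandas.Index([\"top_result\", \"low_result\"], name=\"doc_id\"))\n",
    "\n",
    "# Raw click-through rate divides by how often the result was shown\n",
    "toy_stats[\"ctr\"] = toy_stats[\"clicked\"] / toy_stats[\"impressions\"]\n",
    "# The SDBN-style grade divides by how often it was examined\n",
    "toy_stats[\"grade\"] = toy_stats[\"clicked\"] / toy_stats[\"examined\"]\n",
    "print(toy_stats.sort_values(\"grade\", ascending=False))"
   ]
  },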
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Figure 11.6 Source Code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "        <style>img {width:125px; height:125px; }\n",
       "               tr {font-size:24px; text-align:left; font-weight:normal; }\n",
       "               td {padding-right:40px; text-align:left; }\n",
       "               th {font-size:28px; text-align:left; }\n",
       "               th:nth-child(4) {width:125px; }</style><h1>SDBN judgments for q=dryer</h><table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>grade</th>\n",
       "      <th>upc</th>\n",
       "      <th>image</th>\n",
       "      <th>name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.4118</td>\n",
       "      <td>856751002097</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/856751002097.jpg\"></td>\n",
       "      <td>Practecol - Dryer Balls (2-Pack)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.3924</td>\n",
       "      <td>48231011396</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/48231011396.jpg\"></td>\n",
       "      <td>LG - 3.5 Cu. Ft. 7-Cycle High-Efficiency Washer - White</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.3164</td>\n",
       "      <td>84691226727</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/84691226727.jpg\"></td>\n",
       "      <td>GE - 6.0 Cu. Ft. 3-Cycle Electric Dryer - White</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.2938</td>\n",
       "      <td>74108007469</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/74108007469.jpg\"></td>\n",
       "      <td>Conair - 1875-Watt Folding Handle Hair Dryer - Blue</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.2752</td>\n",
       "      <td>12505525766</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/12505525766.jpg\"></td>\n",
       "      <td>Smart Choice - 6' 30 Amp 3-Prong Dryer Cord</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "render_judged(products, calculate_grade(sessions), grade_col=\"grade\", label=f\"SDBN judgments for q={query}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Listing 11.10 Source Code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "              clicked  examined     grade                                 name\n",
      "doc_id                                                                        \n",
      "97360810042       412       642  0.641745  Transformers: Dark of the Moon -...\n",
      "400192926087       62       129  0.480620  Transformers: Dark of the Moon -...\n",
      "97363560449        96       243  0.395062  Transformers: Dark of the Moon -...\n",
      "97363532149        42       130  0.323077  Transformers: Revenge of the Fal...\n",
      "93624956037        41       154  0.266234  Transformers: Dark of the Moon -...\n",
      "47875842328       367      1531  0.239713  Transformers: Dark of the Moon S...\n",
      "47875841420       217       960  0.226042  Transformers: Dark of the Moon D...\n",
      "25192107191       176      1082  0.162662  Fast Five - Widescreen - Blu-ray...\n",
      "786936817218      118       777  0.151866  Pirates Of The Caribbean: On Str...\n",
      "786936817218      118       777  0.151866  Pirates of the Caribbean: On Str...\n",
      "36725235564        41       277  0.148014  Samsung - 40\" Class - LCD - 1080...\n",
      "24543701538       182      1232  0.147727  The A-Team - Widescreen Dubbed S...\n",
      "47875841369        37       251  0.147410  Transformers: Dark of the Moon -...\n",
      "47875841406        80       626  0.127796  Transformers: Dark of the Moon A...\n",
      "24543750949        31       313  0.099042  X-Men: First Class - Widescreen ...\n",
      "47875842335        53       681  0.077827  Transformers: Dark of the Moon S...\n"
     ]
    }
   ],
   "source": [
    "query = \"transformers dark of the moon\"\n",
    "sessions = get_sessions(query)\n",
    "grade_data = calculate_grade(sessions)\n",
    "print_dataframe(grade_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Figure 11.7 Source Code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "        <style>img {width:125px; height:125px; }\n",
       "               tr {font-size:24px; text-align:left; font-weight:normal; }\n",
       "               td {padding-right:40px; text-align:left; }\n",
       "               th {font-size:28px; text-align:left; }\n",
       "               th:nth-child(4) {width:125px; }</style><h1>SDBN judgments for q=transformers dark of the moon</h><table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>grade</th>\n",
       "      <th>upc</th>\n",
       "      <th>image</th>\n",
       "      <th>name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.6417</td>\n",
       "      <td>97360810042</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/97360810042.jpg\"></td>\n",
       "      <td>Transformers: Dark of the Moon - Blu-ray Disc</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.4806</td>\n",
       "      <td>400192926087</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/400192926087.jpg\"></td>\n",
       "      <td>Transformers: Dark of the Moon - Original Soundtrack - CD</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.3951</td>\n",
       "      <td>97363560449</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/97363560449.jpg\"></td>\n",
       "      <td>Transformers: Dark of the Moon - Widescreen Dubbed Subtitle - DVD</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.3231</td>\n",
       "      <td>97363532149</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/97363532149.jpg\"></td>\n",
       "      <td>Transformers: Revenge of the Fallen - Widescreen Dubbed Subtitle - DVD</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.2662</td>\n",
       "      <td>93624956037</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/93624956037.jpg\"></td>\n",
       "      <td>Transformers: Dark of the Moon - Original Soundtrack - CD</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "render_judged(products, calculate_grade(sessions), grade_col=\"grade\", label=f\"SDBN judgments for q={query}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Up next: [Dealing with Low Confidence Situations](3.SDBN-Confidence-Bias.ipynb)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
